ABSTRACT

We present a method of measuring the galaxy power spectrum based on the
multiresolution analysis of the discrete wavelet transformation (DWT). Besides
the technical advantage of computational feasibility for data sets with large
volume and complex geometry, the DWT scale-by-scale decomposition provides
physical insight into the covariance matrix of the cosmic mass field. Since
the DWT representation strongly suppresses the off-diagonal
components of the covariance for self-similar clustering, the DWT covariance for
all popular models of the cold dark matter cosmogony is generally diagonal,
or j(scale)-diagonal, in the scale range in which the second or higher order
scale-scale correlations are weak. In this range, the DWT covariance gives a
lossless estimation of the power spectrum, which is equal to the corresponding
Fourier power spectrum banded with a logarithmic scaling. This DWT
estimator is optimized in the sense that the spatial resolution adapts
automatically to the perturbation wavelength to be studied. In the scale range
in which the scale-scale correlation is significant, the accuracy of a power
spectrum detection depends on the scale-scale or band-band correlations. In
this case, a precision measurement of the power spectrum, or a precision
confrontation of the observed power spectrum with models, requires a measurement of
the scale-scale or band-band correlations. We show that the DWT
covariance can be employed to measure both the band-power spectrum and
the second order scale-scale correlation.

We also present the DWT algorithm for the binning and Poisson sampling
of real observational data. We show that the so-called alias effect, which appears
in the usual binning schemes, can be exactly eliminated by the DWT binning.
Since a Poisson process possesses a diagonal covariance in the DWT representation,
the Poisson sampling and selection effects on the detection of the power spectrum and second
order scale-scale correlation are suppressed to a minimum. Moreover,
the effect of the non-Gaussian features of the Poisson sampling can also be
calculated in this framework. The DWT method is open, i.e. one can add further
DWT algorithms on top of the basic decomposition in order to estimate other effects
on the power spectrum detection, such as non-Gaussian correlations and bias
models.

Subject headings: cosmology: theory - large-scale structure of the universe 
1. Introduction



Measuring the galaxy power spectrum has long been a central subject of
large scale structure studies. Although the power spectrum is only a second order statistical
measure of the deviations of a random field of mass density, $\delta(x)$, from homogeneity,
it directly reflects the physical scales of the processes that affect structure formation.
Mathematically, the positive definiteness of the power spectrum is useful for constraining
the parameter space when comparing predictions with data. Since the ongoing and upcoming
redshift surveys of galaxies will provide galaxy distribution data of greatly improved
quality and larger quantity, methods of measuring the
power spectrum that are more precise and computationally efficient need to be developed.

Different methods of measuring the power spectrum adopt different representations,
or decompositions, of the covariance Cov $= \langle\delta(x)\delta(x')\rangle$, where $\langle ...\rangle$ stands for an ensemble
average. For a representation given by a set of basis functions $\psi_i(x)$ (sometimes referred to as
weight functions), the random field is described by the variables

$$X_i = \int\delta(x)\psi_i(x)\,dx, \tag{1}$$

and the covariance is given by Cov$_{ij} = \langle X_iX_j\rangle$. If the covariance in this representation
is exactly or approximately diagonalized, the diagonal elements $\langle|X_i|^2\rangle$ would be a fair
estimate of the power spectrum, or band-power spectrum. Thus, measuring the power spectrum
is mathematically almost synonymous with diagonalizing the covariance of the density field
$\delta(x)$, or calculating the eigenvalues of the covariance matrix.

Traditionally, the Fourier decomposition, and hence the Fourier power spectrum, has been
the popular tool for analyzing a cosmic density field, because the Fourier transform retains
the translation invariance of a homogeneous and isotropic universe. However, the observed
samples given by redshift surveys are not translation invariant, due to the selection effect and
irregular geometry of the surveys. To compare the predicted power spectrum effectively
with the observed galaxy distributions, the basis functions of the decomposition should
be chosen to incorporate the selection effect, sampling, and complex geometry of the
data. As a result, various decompositions for measuring the galaxy power spectrum have
been proposed (Tegmark et al. 1998 and references therein). An ideal estimator of the
power spectrum should meet the following conditions:

• The $X_i$'s are independent of each other, i.e. the data are decomposed into mutually
exclusive chunks;

• The $X_i$'s retain all the information of the original data, i.e. the decomposed chunks are
collectively exhaustive;

• It is computationally feasible;

• It allows us to take account of systematic effects, such as redshift distortion,
evolution, morphology-dependence, galactic extinction, etc.

Such ideal estimators are believed to be information lossless, i.e. they retain all the information
on the power spectrum contained in the original data.

We will study, in this paper, the estimator based on the multiscale decomposition,
i.e. the discrete wavelet transform (DWT) representation. The DWT power spectrum
estimator has been applied to measure the power spectrum from samples of the Ly$\alpha$ forests
of QSO absorption spectra (Pando & Fang 1998a). The result has demonstrated that
the DWT power spectrum estimator can meet the conditions listed above; in particular, it
is very helpful in overcoming the difficulties of complex geometry and sampling. Within the
framework of the DWT, this paper will present a general working scheme for extracting the
statistical characteristics from the observational data, in which the selection effect, sampling
and binning are addressed.

It has been recognized recently that the non-Gaussian behavior of $X_i$ is substantial for a
precise measurement of the power spectrum. The accuracy of a power spectrum estimation
is significantly affected by the so-called power spectrum correlations induced by non-linear
clustering (Meiksin & White 1998, Scoccimarro, Zaldarriaga & Hui 1999). The power
spectrum correlation is also found to be essential for recovering the initial power spectrum
by a Gaussianization of the observed distribution (Weinberg 1992, Narayanan & Weinberg
1998, Feng & Fang 1999). Thus, beyond the conditions mentioned above for an ideal
power spectrum estimator, one should add one more requirement: that the power spectrum
correlations caused by the non-linear clustering and Poisson sampling are calculable. We will
show that the power spectrum correlations, or the scale-scale correlations, can be calculated
in the DWT analysis.

Moreover, for popular models of the cold dark matter cosmogony, including the
standard cold dark matter model (SCDM), the open CDM model (OCDM), and the flat CDM
model (LCDM), the scale-scale correlations have been found to be negligible on large scales, and
the non-local scale-scale correlations are also negligible even on small scales (Fang, Deng &
Fang 2000). That is, the effect of the power spectrum correlations is largely suppressed in
the DWT representation. We will show how to take advantage of this suppression for a
scale-by-scale approach to measuring the power spectrum.

The paper is organized as follows. §2 gives a brief description of the DWT
decomposition of the covariance of a random density field. The physical meaning and
mathematical properties of the j-diagonal and j off-diagonal components of the covariance
are also discussed. In §3, an optimized band-power spectrum estimator based on the
DWT j-diagonal covariance is proposed. In addition, the scale-scale correlation extracted
from the j off-diagonal components of the covariance is investigated. This correlation
gives the scale range in which the power spectrum obtained by the j-diagonalization
is information lossless. We then present the algorithm for estimating the DWT band-power
spectrum from an observed galaxy catalog. It includes the DWT binning (§4) and the
DWT technique for dealing with Poisson sampling and selection (§5). The discussions and
conclusions are given in §6. A brief introduction to the DWT analysis is given in the Appendix.

2. Covariance of density fluctuations in the DWT representation 

2.1. DWT decomposition of density fields 

For the sake of simplicity, we analyze a 1-D density distribution $\rho(x)$ in the range
$0 < x < L$, which is assumed to be a stationary random field. The density contrast is
defined by $\delta(x) = (\rho(x) - \bar{\rho})/\bar{\rho}$, where $\bar{\rho} = \langle\rho(x)\rangle$, and $\langle ...\rangle$ stands for the ensemble average. It
would be straightforward to extend most of the results to 2-D and 3-D. Some specific problems
related to the extension to higher dimensions will be discussed in §6. In addition, the redshift
distortion will not be taken into account in this paper.

To ensure that a multiscale decomposition of $\delta(x)$ is information-lossless, the natural
working scheme is to adopt the discrete wavelet transformation (DWT) within the framework of
multiresolution analysis (MRA). The mathematical construction of the MRA theory is briefly
sketched in Appendix A.

Let $\delta^P(x)$ be the periodic extension of $\delta(x)$, i.e. $\delta^P(x) = \delta(x - [x/L]\cdot L)$, where $[\eta]$
denotes the integer part of $\eta$. From eq.(A36), the density contrast $\delta^P(x)$ can be decomposed in
terms of the orthonormal wavelet basis $\psi_{j,l}(x)$ as

$$\delta^P(x) = \sum_{j=0}^{\infty}\sum_{l=-\infty}^{+\infty}\tilde{\epsilon}_{j,l}\psi_{j,l}(x). \tag{2}$$

The wavelet function coefficient (WFC), $\tilde{\epsilon}_{j,l}$, is given by the inner product of $\delta^P(x)$ with $\psi_{j,l}(x)$,

$$\tilde{\epsilon}_{j,l} = \int_{-\infty}^{\infty}\delta^P(x)\psi_{j,l}(x)\,dx, \tag{3}$$

which describes the density fluctuation on scale $L/2^j$ at position $lL/2^j$. The WFCs are the
variables of the random field in the DWT representation. The original distribution can be
exactly and unredundantly reconstructed from these decomposed variables.
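The lossless, non-redundant character of the decomposition can be checked numerically. The following is a minimal sketch, assuming the Haar wavelet as the basis and a field sampled on $2^J$ grid points. The text does not prescribe this wavelet, the function names `haar_wfc` and `haar_inverse` are illustrative, and the coefficients carry the discrete orthonormal normalization rather than the continuum normalization of the inner product above.

```python
import numpy as np

def haar_wfc(delta):
    """Periodic orthonormal Haar DWT of a length-2**J array.

    Returns (wfc, s0): wfc[j] holds the 2**j wavelet coefficients on
    scale j, and s0 is the single remaining scaling coefficient.
    """
    s = np.asarray(delta, dtype=float)
    J = s.size.bit_length() - 1
    assert s.size == 2 ** J, "length must be a power of two"
    wfc = {}
    for j in range(J - 1, -1, -1):
        even, odd = s[0::2], s[1::2]
        wfc[j] = (even - odd) / np.sqrt(2.0)   # fluctuation on scale j
        s = (even + odd) / np.sqrt(2.0)        # smoothed field
    return wfc, s[0]

def haar_inverse(wfc, s0):
    """Rebuild the sampled field from its coefficients."""
    s = np.array([s0], dtype=float)
    for j in range(len(wfc)):
        d = wfc[j]
        out = np.empty(2 * s.size)
        out[0::2] = (s + d) / np.sqrt(2.0)
        out[1::2] = (s - d) / np.sqrt(2.0)
        s = out
    return s

rng = np.random.default_rng(0)
delta = rng.standard_normal(1024)
wfc, s0 = haar_wfc(delta)
n_coeff = sum(w.size for w in wfc.values()) + 1   # WFCs plus one mean term
recovered = haar_inverse(wfc, s0)
print(n_coeff, np.allclose(recovered, delta))
```

The number of coefficients equals the number of samples (no redundancy), and the inverse transform returns the input to machine precision (no loss).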

By using the periodized wavelet function defined by

$$\psi^P_{j,l}(x) = \left(\frac{2^j}{L}\right)^{1/2}\sum_{n=-\infty}^{\infty}\psi[2^j(x/L+n)-l], \tag{4}$$

where $\psi$ is the basic wavelet function [eq.(A21)], eq.(2) becomes

$$\delta^P(x) = \sum_{j=0}^{\infty}\sum_{l=0}^{2^j-1}\tilde{\epsilon}_{j,l}\psi^P_{j,l}(x). \tag{5}$$

The WFC can then be computed by

$$\tilde{\epsilon}_{j,l} = \int_0^L\delta^P(x)\psi^P_{j,l}(x)\,dx. \tag{6}$$

We will always use the periodized functions below, and drop the superscript P.

Furthermore, $\psi_{j,l}(x)$ is admissible [eq.(A27)], which implies that $\psi_{j,l}(x)$ has zero mean
if it is integrable,

$$\int\psi_{j,l}(x)\,dx = 0. \tag{7}$$

It then follows from eq.(2) that

$$\langle\tilde{\epsilon}_{j,l}\rangle = 0. \tag{8}$$
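The Fourier decomposition of the field $\delta(x)$ is given by

$$\delta(x) = \sum_{n=-\infty}^{\infty}\delta_ne^{i2\pi nx/L}, \tag{9}$$

where n is an integer, and the Fourier coefficient $\delta_n$ is

$$\delta_n = \langle n|\delta\rangle = \frac{1}{L}\int_0^L\delta(x)e^{-i2\pi nx/L}\,dx. \tag{10}$$

Since both the bases of the Fourier transform and the DWT are orthogonal and
complete in the space of 1-D functions with period length L, we have

$$\sum_{j=0}^{\infty}\sum_{l=0}^{2^j-1}\langle n|\psi_{j,l}\rangle\langle\psi_{j,l}|n'\rangle = L\,\delta^K_{n,n'}, \tag{11}$$

where $\delta^K_{n,n'}$ is the Kronecker delta function, and $\langle n|\psi_{j,l}\rangle$ is the Fourier transform of the wavelet
$\psi_{j,l}$, given by

$$\hat{\psi}_{j,l}(n) = \langle n|\psi_{j,l}\rangle = \int_0^L\psi_{j,l}(x)e^{-i2\pi nx/L}\,dx. \tag{12}$$

Considering that the wavelet $\psi_{j,l}(x)$ is related to the basic wavelet $\psi(\eta)$ by eq.(A11), eq.(12) can
be rewritten as

$$\hat{\psi}_{j,l}(n) = \left(\frac{2^j}{L}\right)^{-1/2}\hat{\psi}(n/2^j)e^{-i2\pi nl/2^j}, \tag{13}$$

where $\hat{\psi}(n)$ is the Fourier transform of the basic wavelet,

$$\hat{\psi}(n) = \int_{-\infty}^{\infty}\psi(\eta)e^{-i2\pi n\eta}\,d\eta. \tag{14}$$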
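The step from the Fourier transform of $\psi_{j,l}$ to its expression through the basic wavelet is a change of variable. As a sketch, assume that eq.(A11) has the form $\psi_{j,l}(x) = (2^j/L)^{1/2}\psi(2^jx/L - l)$ (consistent with the periodized wavelet of eq.(4), but not shown in this section), and that the integral of the periodized wavelet over one period equals the integral of the un-periodized wavelet over the whole line:

```latex
\begin{aligned}
\hat{\psi}_{j,l}(n)
 &= \left(\frac{2^j}{L}\right)^{1/2}\int_{-\infty}^{\infty}
    \psi\!\left(\frac{2^j x}{L}-l\right)e^{-i2\pi nx/L}\,dx,
    \qquad \eta=\frac{2^j x}{L}-l,\quad dx=\frac{L}{2^j}\,d\eta \\
 &= \left(\frac{2^j}{L}\right)^{1/2}\frac{L}{2^j}\,e^{-i2\pi nl/2^j}
    \int_{-\infty}^{\infty}\psi(\eta)\,e^{-i2\pi (n/2^j)\eta}\,d\eta \\
 &= \left(\frac{2^j}{L}\right)^{-1/2}\hat{\psi}(n/2^j)\,e^{-i2\pi nl/2^j}.
\end{aligned}
```

The position index l enters only through the phase factor, while the scale index j rescales the argument of $\hat{\psi}$.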

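Substituting expansion (9) into eq.(6) yields

$$\tilde{\epsilon}_{j,l} = \sum_{n=-\infty}^{\infty}\delta_n\int_0^Le^{i2\pi nx/L}\psi_{j,l}(x)\,dx = \sum_{n=-\infty}^{\infty}\delta_n\hat{\psi}_{j,l}(-n). \tag{15}$$

Similarly, inserting expansion (5) into eq.(10), we have

$$\delta_n = \frac{1}{L}\sum_{j=0}^{\infty}\sum_{l=0}^{2^j-1}\tilde{\epsilon}_{j,l}\hat{\psi}_{j,l}(n) = \sum_{j=0}^{\infty}\sum_{l=0}^{2^j-1}\left(\frac{1}{2^jL}\right)^{1/2}\tilde{\epsilon}_{j,l}e^{-i2\pi nl/2^j}\hat{\psi}(n/2^j), \quad n\neq 0. \tag{16}$$

Equations (15) and (16) show that both the Fourier variables $\delta_n$ and the DWT variables $\tilde{\epsilon}_{j,l}$
are complete.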

However, the statistical properties of the Fourier mode n and the DWT mode (j, l)
are quite different. For a non-Gaussian field consisting of randomly and homogeneously
distributed clumps with a non-Gaussian probability distribution function (PDF), the
one-point distributions of the real and imaginary components of the Fourier modes
could still be Gaussian. That is because the Fourier modes are subject to the central
limit theorem of random fields (Adler 1981). Even when the non-Gaussian clumps are
correlated, the central limit theorem still holds if the two-point correlation function of the
clumps approaches zero sufficiently fast (Fan & Bardeen 1995). Thus, the non-Gaussian
information could be lost in the Fourier representation if the phases of the Fourier
coefficients are missing.

On the other hand, the DWT basis is not subject to the central limit theorem.
A key condition necessary for the central limit theorem to hold is that the modulus of
the decomposition basis is less than $C/\sqrt{L}$, where L is the size of the sample and C is
a constant (Ivanov & Leonenko 1989). The Fourier basis obviously satisfies this condition
because $(1/\sqrt{L})|\sin 2\pi nx/L| < C/\sqrt{L}$, where C is independent of x and n. The
DWT basis, in contrast, is compactly supported (Appendix A), and its modulus does not satisfy the
condition $< C/\sqrt{L}$. Consequently, for non-Gaussian fields, the one-point distributions of
the Fourier variables $|\delta_n|$ could be Gaussian, while the one-point
distributions of the DWT variables $\tilde{\epsilon}_{j,l}$ are non-Gaussian (Pando & Fang 1998b).
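The contrast between the two representations can be illustrated with a toy field. The sketch below assumes identical point-like clumps placed at random on a grid and uses the finest-scale Haar coefficients as the DWT variables; both choices, and the helper `excess_kurtosis`, are illustrative assumptions rather than part of the method described here.

```python
import numpy as np

def excess_kurtosis(v):
    """Sample excess kurtosis (zero for a Gaussian distribution)."""
    v = np.asarray(v, dtype=float)
    v = v - v.mean()
    return np.mean(v ** 4) / np.mean(v ** 2) ** 2 - 3.0

rng = np.random.default_rng(1)
n_grid, n_clumps = 4096, 64

# Field of identical, randomly placed point-like clumps (strongly non-Gaussian).
field = np.zeros(n_grid)
np.add.at(field, rng.integers(0, n_grid, n_clumps), 1.0)
delta = field / field.mean() - 1.0

# Fourier variables: real parts of the modes n = 1 ... n_grid/2 - 1.
fourier_re = np.fft.rfft(delta)[1:n_grid // 2].real

# DWT variables: finest-scale Haar coefficients, each built from two cells only.
wfc_fine = (delta[0::2] - delta[1::2]) / np.sqrt(2.0)

k_fourier = excess_kurtosis(fourier_re)
k_wfc = excess_kurtosis(wfc_fine)
print(f"Fourier: {k_fourier:+.2f}   finest-scale WFC: {k_wfc:+.2f}")
```

Each Fourier mode sums contributions from all clumps, so its one-point distribution is close to Gaussian, whereas each localized wavelet coefficient sees at most a few clumps and keeps the non-Gaussian signature.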

2.2. The WFC covariance and DWT power spectrum 

In the DWT representation, the covariance $\langle\delta(x)\delta(x')\rangle$ is expressed by a matrix
$\langle\tilde{\epsilon}_{j,l}\tilde{\epsilon}_{j',l'}\rangle$ with subscripts (j, l); (j', l'). The elements with j = j', l = l' will be called diagonals,
those with j = j' the j-diagonals, and those with j ≠ j' the j off-diagonals.

Parseval's theorem for the DWT decomposition is (Fang & Thews 1998)

$$\frac{1}{L}\int_0^L|\delta(x)|^2\,dx = \sum_{j=0}^{\infty}\frac{1}{L}\sum_{l=0}^{2^j-1}|\tilde{\epsilon}_{j,l}|^2, \tag{17}$$

which implies that the power of perturbations can be divided into modes (j, l); $|\tilde{\epsilon}_{j,l}|^2$
describes the power of the mode (j, l). One can then define the DWT power spectrum by
the diagonals of the covariance matrix, i.e. (see footnote 3)

$$P_{j,l} = \langle\tilde{\epsilon}_{j,l}^2\rangle. \tag{18}$$

Since the random variables $\tilde{\epsilon}_{j,l}$ are complete, one can define a Gaussian field $\delta(x)$ by
requiring that all the variables $\tilde{\epsilon}_{j,l}$ are distributed as a Gaussian process with the covariance

$$\langle\tilde{\epsilon}_{j,l}\tilde{\epsilon}_{j',l'}\rangle = P_{j,l}\,\delta^K_{j,j'}\delta^K_{l,l'}, \tag{19}$$

and zero ensemble averages of all higher order cumulants of $\tilde{\epsilon}_{j,l}$. Thus, a Gaussian field is
completely described by its DWT power spectrum $P_{j,l}$. For a homogeneous Gaussian field,
the DWT power spectrum $P_{j,l}$ is l-independent, i.e. $P_{j,l} = P_j$.

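Using eqs.(15) and (16), the covariance in the Fourier and DWT representations can
be converted from one form to the other by

$$\langle\delta_n\delta^{\dagger}_{n'}\rangle = \frac{1}{L^2}\sum_{j,j'=0}^{+\infty}\sum_{l=0}^{2^j-1}\sum_{l'=0}^{2^{j'}-1}\langle\tilde{\epsilon}_{j,l}\tilde{\epsilon}_{j',l'}\rangle\hat{\psi}_{j,l}(n)\hat{\psi}^{\dagger}_{j',l'}(n') \tag{20}$$

and conversely

$$\langle\tilde{\epsilon}_{j,l}\tilde{\epsilon}_{j',l'}\rangle = \sum_{n,n'=-\infty}^{+\infty}\langle\delta_n\delta^{\dagger}_{n'}\rangle\hat{\psi}_{j',l'}(n')\hat{\psi}^{\dagger}_{j,l}(n). \tag{21}$$

Therefore, for a homogeneous Gaussian field given by the DWT power spectrum $P_j$,
eq.(20) implies

$$\langle\delta_n\delta^{\dagger}_{n'}\rangle = P(n)\,\delta^K_{n,n'}, \tag{22}$$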

where 

(23) 



[Footnote 3] The DWT power spectrum, also called the scalogram, has been extensively applied in signal
analysis (e.g. Mallat 1999).

In the derivation of eq.(22), we used

$$\sum_{l=0}^{2^j-1}e^{-i2\pi(n-n')l/2^j} = 2^j\,\delta^K_{n,n'}. \tag{24}$$

Eq.(22) shows that for a homogeneous Gaussian field, the Fourier power spectrum $P(n)$ is
uniquely determined by the DWT power spectrum $P_j$.

However, the reverse relation does not exist, i.e. one cannot show that the DWT
covariance is given by eq.(19) with $P_{j,l} = P_j$ if the Fourier covariance is given by eq.(22).
This indicates that the Fourier and WFC covariances are not equivalent. For instance,
fields consisting of homogeneously distributed non-Gaussian clumps generally do not satisfy
eq.(19) with an l-independent $P_{j,l}$, but do satisfy eq.(22). That is, eq.(19) with an l-independent
$P_{j,l}$ places stronger constraints on the random field than eq.(22): eq.(22)
holds whenever eq.(19) with an l-independent $P_{j,l}$ holds, but the converse is not generally true.

2.3. j off-diagonals of the WFC covariance 

We now identify the physical meaning of the j off-diagonal components of the WFC 
covariance. 

When the "fair sample hypothesis" (Peebles 1980) holds, or equivalently, the random
field is ergodic, the $2^j$ WFCs $\tilde{\epsilon}_{j,l}$, $l = 0...2^j-1$, for a given j can be taken as $2^j$ independent
measurements, because they are measured by projecting onto the mutually orthogonal basis
functions $\psi_{j,l}(x)$. Accordingly, the $2^j$ WFCs form a statistical ensemble on the scale j. This ensemble
actually represents the one-point distribution of the fluctuations of the DWT modes at a
given scale j. The average over l is thus a fair estimate of the ensemble average.

For a Gaussian field, these one-point distributions are Gaussian. However, even if
the one-point distributions for all j are Gaussian, the density field $\delta(x)$ could still be
non-Gaussian. That is simply because the statistical properties of the WFCs with respect to the indices
j and l are independent. It is easy to construct a density field $\delta(x)$ for which the WFCs
$\tilde{\epsilon}_{j,l}$ are Poisson or Gaussian in their one-point distribution with respect to l, while highly
non-Gaussian in terms of j (Greiner, Lipa & Carruthers 1995). A simple example is
as follows. Suppose the one-point distribution of the $2^j$ WFCs on a scale
j is Gaussian. If the WFCs on the scale j + 1 are tied to those on the scale j, e.g.,

$$\tilde{\epsilon}_{j+1,2l} = a\tilde{\epsilon}_{j,l}, \qquad \tilde{\epsilon}_{j+1,2l+1} = b\tilde{\epsilon}_{j,l}, \tag{25}$$

where a and b are arbitrary constants, the one-point distribution of the $2^{j+1}$ WFCs $\tilde{\epsilon}_{j+1,l}$
is also Gaussian. However, the coherent structure given by eq.(25) leads to a strong
correlation between $\tilde{\epsilon}_{j+1,l}$ and $\tilde{\epsilon}_{j,l}$, i.e. the scale j + 1 fluctuations are always proportional
to those on the scale j at the same position. This is a local scale-scale correlation. One can
also design a non-local scale-scale correlation by

$$\tilde{\epsilon}_{j+1,2l} = a\tilde{\epsilon}_{j,l+\Delta l}, \qquad \tilde{\epsilon}_{j+1,2l+1} = b\tilde{\epsilon}_{j,l+\Delta l}, \tag{26}$$

where $\Delta l = 1, 2, ...$ Eq.(26) leads to a strong correlation between the fluctuations on scales
j + 1 and j, but at two places separated by the distance $\Delta l$.
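The local construction can be reproduced in a few lines. This is a minimal sketch that works directly on arrays of coefficients; the choice a = b = 1 is an illustrative assumption that keeps the one-point distribution on scale j + 1 exactly Gaussian (for |a| ≠ |b| it becomes a mixture of two Gaussians of different widths).

```python
import numpy as np

rng = np.random.default_rng(2)
j = 12
a = b = 1.0   # illustrative choice; the construction allows arbitrary constants

w_j = rng.standard_normal(2 ** j)      # Gaussian WFCs on scale j
w_j1 = np.empty(2 ** (j + 1))          # WFCs on scale j+1 tied to scale j
w_j1[0::2] = a * w_j
w_j1[1::2] = b * w_j

def excess_kurtosis(v):
    """Sample excess kurtosis (zero for a Gaussian distribution)."""
    v = v - v.mean()
    return np.mean(v ** 4) / np.mean(v ** 2) ** 2 - 3.0

# One-point statistics on scale j+1 look Gaussian ...
k_fine = excess_kurtosis(w_j1)
# ... yet each fine-scale WFC is fixed by the coarse one at the same place.
r_local = np.corrcoef(w_j1[0::2], w_j)[0, 1]
print(f"excess kurtosis on scale j+1: {k_fine:+.3f}, local correlation: {r_local:.3f}")
```

A one-point test on either scale would not flag this field as non-Gaussian; only the cross-scale statistic reveals the coherent structure.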

Hence, in terms of the DWT representation, a homogeneous Gaussian field requires
that (1) the one-point distributions of the WFCs with respect to l are Gaussian, and (2)
the distributions of WFCs with different j's are uncorrelated, such as

$$\langle\tilde{\epsilon}_{j+1,l}\tilde{\epsilon}_{j,l'}\rangle = 0. \tag{27}$$

Correspondingly, in the Fourier representation, a Gaussian field also has two
requirements: (1) the one-point distributions of the amplitudes of the Fourier modes $|\delta_n|$ are
Gaussian; (2) the phases of $\delta_n$ are random. Therefore, eq.(27) is the DWT counterpart of
the Fourier random phase. However, it is difficult, or practically impossible, to capture the
phase information of each Fourier mode. The local scale-scale correlation is overlooked
by the Fourier covariance.

In summary, the j off-diagonals of the WFC covariance provide the information on
the scale-scale correlation. This non-Gaussian feature arises from the mode-mode coupling of
gravitational clustering, and cannot be measured by the higher order cumulants of the
one-point distribution for a given scale j, but rather by the cross correlation between different
scales. The covariance of a system without scale-scale correlation will be j-diagonal, i.e.

$$\langle\tilde{\epsilon}_{j,l}\tilde{\epsilon}_{j',l'}\rangle = \langle\tilde{\epsilon}_{j,l}\rangle\langle\tilde{\epsilon}_{j',l'}\rangle = 0, \quad j\neq j', \tag{28}$$

where eq.(8) has been used at the last step.

3. Statistical information extracting from the WFC covariance 

3.1. j-diagonalization of the WFC covariance 

It is well known that the DWT is powerful for data compression. For a very wide
class of stochastic clustering processes, the off-diagonal components of the covariance are
strongly suppressed in the DWT representation. This suppression is especially efficient
for self-similar clustering. For instance, one can show analytically that the covariance
in the DWT representation is exactly diagonal for some popular hierarchical models of
structure formation, such as the block model and its variants (Meneveau & Sreenivasan
1987, Cole & Kaiser 1988). In this respect, the DWT basis represents the adequate normal
coordinates. In other words, the DWT analysis can be understood as a Proper Orthogonal
Decomposition (POD), or a Karhunen-Loève transformation (e.g. Aubry et al. 1988), with
regard to the second order correlations of these stochastic clustering processes.
For more realistic models and observed samples, the WFC covariance is not fully
diagonal, but mostly j-diagonal. In fact, this character is evident from the
measurement of the fourth order scale-scale correlation in observational samples such as
the Ly$\alpha$ forest lines (Pando et al. 1998), the transmitted flux of QSO absorption spectra
(Feng & Fang 1999) and the APM bright galaxy catalog (Feng, Deng & Fang 2000). A
common conclusion is that the scale-scale correlations are very weak, and negligible on large
scales, i.e. $\langle\tilde{\epsilon}^2_{j,l}\tilde{\epsilon}^2_{j',l'}\rangle \simeq \langle\tilde{\epsilon}^2_{j,l}\rangle\langle\tilde{\epsilon}^2_{j',l'}\rangle$ for $j \neq j'$ and $j, j' < J_{ss}$, where $J_{ss}$ denotes the scale above
which the scale-scale correlation is not significant. It is also true for the mass distributions
and the 2-D and 3-D mock galaxy catalogs in the CDM family of models (Feng, Deng &
Fang 2000). This result indicates $\langle\tilde{\epsilon}_{j,l}\tilde{\epsilon}_{j',l'}\rangle \simeq 0$ for $j \neq j'$ and $j, j' < J_{ss}$. Of
course, the typical scale $J_{ss}$ depends on the models or observational samples.

Therefore, on large spatial scales, $j < J_{ss}$, the WFC covariance is already j-diagonal.
Within this range, the covariance matrix is decomposed into j sub-matrices $\langle\tilde{\epsilon}_{j,l}\tilde{\epsilon}_{j,l'}\rangle$. This
guides us to design the first statistic, the DWT band-power spectrum.

3.2. The DWT band-power spectrum

Because the model-predicted power spectrum is currently expressed in the Fourier
representation, any statistical estimator designed for measuring the power spectrum from
real data should have a simple relation with the Fourier power spectrum.

Since we have only one realization of the cosmic mass field, no ensemble is available
for each mode n. One cannot measure the Fourier power spectrum $P(n)$ itself, as it is the
variance of the amplitude $|\delta_n|$ of mode n. Generally, a power spectrum estimator measures
a banded power spectrum,

$$P_j = \sum_nW_j(n)P(n), \tag{29}$$

where $W_j(n)$ is a window function localized in n (or Fourier) space. The
problem that arises here is: what is the criterion for a reasonable banding, and how can
the banded power spectrum be optimized? The DWT representation provides a natural and
reasonable way for the banding.

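As discussed in §2.3, for an ergodic field the $2^j$ WFCs at a given j form a one-point
distribution of the fluctuations at the scale j. Therefore, the DWT power spectrum
at the scale j can be defined as the variance of the one-point distribution, i.e.,

$$P_j = \frac{1}{2^j}\sum_{l=0}^{2^j-1}(\tilde{\epsilon}_{j,l}-\overline{\tilde{\epsilon}_j})^2, \tag{30}$$

where $\overline{\tilde{\epsilon}_j}$ is the average of the WFCs over l at the scale j. Because of the zero mean of the WFCs, $\langle\tilde{\epsilon}_{j,l}\rangle = 0$ [eq.(8)], $P_j$ can be written, statistically, as

$$P_j = \frac{1}{2^j}\sum_{l=0}^{2^j-1}\tilde{\epsilon}_{j,l}^2, \tag{31}$$

which is an ergodicity-allowed spatial average of $P_{j,l}$, and is usually referred to as the DWT power
spectrum. As we will show below, eq.(31) gives an estimator of the band-averaged Fourier power
spectrum.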
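The band-power estimator is a one-line average once the WFCs are available. The sketch below assumes the Haar wavelet, a field sampled on $2^J$ grid points and the discrete orthonormal normalization (so overall factors of L are not tracked); `haar_wfc` and `dwt_power` are illustrative names.

```python
import numpy as np

def haar_wfc(delta):
    """Periodic orthonormal Haar DWT; wfc[j] holds the 2**j WFCs on scale j."""
    s = np.asarray(delta, dtype=float)
    J = s.size.bit_length() - 1
    assert s.size == 2 ** J, "length must be a power of two"
    wfc = {}
    for j in range(J - 1, -1, -1):
        even, odd = s[0::2], s[1::2]
        wfc[j] = (even - odd) / np.sqrt(2.0)
        s = (even + odd) / np.sqrt(2.0)
    return wfc, s[0]

def dwt_power(delta):
    """Band powers P_j: the average over l of the squared WFCs on scale j."""
    wfc, _ = haar_wfc(delta)
    return {j: np.mean(w ** 2) for j, w in wfc.items()}

rng = np.random.default_rng(3)
delta = rng.standard_normal(4096)
delta -= delta.mean()                 # density contrast has zero mean

P = dwt_power(delta)
# Discrete analogue of Parseval's theorem: sum_j 2^j P_j = sum_x delta(x)^2
total = sum(2 ** j * P[j] for j in P)
print(np.isclose(total, np.sum(delta ** 2)))
for j in (9, 10, 11):
    print(j, round(P[j], 3))
```

For this white-noise input the band powers on the well-sampled fine scales scatter around the input variance, and the Parseval sum confirms that the set of $P_j$ accounts for all of the power in the field.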

The DWT power spectrum eq.(31) is certainly less detailed than the power spectra
$P(n)$ or $P_{j,l}$. However, the numbers $P_j$ are probably the maximum set of statistically valuable
band powers that can be extracted from one realization of an ergodic field. The
optimality of this banding can be seen via the phase space $\{x, k\}$, where the wavenumber
$k = 2\pi n/L$. Generally, an orthogonal and complete basis of a multiresolution analysis
decomposes the entire phase space into elements with different shapes, but their volume
always satisfies the uncertainty relation $\Delta x\cdot\Delta k \geq 2\pi$. The ordinary Fourier transform is
not a multiresolution decomposition, but always takes the highest resolution in k, i.e. $\Delta k \to 0$,
and the lowest resolution in x, $\Delta x \to \infty$.

To apply the ergodicity, we chop the survey volume L into pieces of size $\Delta x$. If $\Delta x$ is too
large, or $L/\Delta x$ too small, the ensemble contains few members, and thus there will be larger
vertical errors placed on the estimated power spectrum. In order to minimize this error, we
may make the size of the chopped pieces $\Delta x$ small. Correspondingly, the width of the window
function, $\Delta k = 2\pi/\Delta x$, will broaden, and the scale resolution will be poor, i.e. there will
be a large horizontal error bar placed on the estimated power spectrum. Thus, the optimal
chopping is achieved by a compromise between these two competing factors, $L/\Delta x$ and
$\Delta k$. Generally, $1/\Delta x$ is proportional to the resolvable wavenumber, i.e.

$$1/\Delta x \propto k. \tag{32}$$

Therefore, the optimized banding $\Delta k\,\Delta x = 2\pi$ requires

$$\frac{\Delta k}{k} = \Delta\ln k \simeq 1. \tag{33}$$

That is, the optimized banding is in logarithmic spacing. To detect small scale fluctuations
(larger wavenumber k), the size of the pieces $\Delta x$ is chosen to be smaller. To detect large
scale fluctuations (smaller wavenumber), the size of the pieces $\Delta x$ is chosen to be larger.
The wavelets $\psi_{j,l}(x)$ are constructed by dilating (i.e. changing the scale of) the generating
function by a factor $2^j$ (Appendix A). Therefore, we have $\Delta\ln k \simeq 1$. In this sense, the
DWT is an optimized multiscale decomposition (Farge 1992). Because the set of wavelet
basis functions is complete, one cannot have more independent bands than the $P_j$.

Under the assumption of a homogeneous Gaussian field, the DWT power spectrum
eq.(31) can be rewritten as

$$P_j = \frac{1}{2^j}\sum_{n=-\infty}^{\infty}|\hat{\psi}(n/2^j)|^2P(n), \tag{34}$$

where eqs.(11), (21) and (22) have been used. Comparing with eq.(29), it is clear that $P_j$ is a
band-averaged Fourier power spectrum with the window function

$$W_j(n) = \frac{1}{2^j}|\hat{\psi}(n/2^j)|^2. \tag{35}$$

Generally, the function $\hat{\psi}(n)$ is non-zero in two narrow wavenumber ranges centered at
$n = \pm n_p$ with width $\Delta n_p$. Therefore $P_j$ is the band spectrum centered at

$$\ln n_j = j\ln 2 + \ln n_p, \tag{36}$$

with the band width

$$\Delta\ln n = \Delta n_p/n_p, \tag{37}$$

which stays constant logarithmically. Eqs.(36) and (37) show that the countable data set
$\{P_j, j = 1, 2...\}$ represents a scale-by-scale band-averaged Fourier power spectrum with
logarithmic spacing in wavenumber. $P_j$ is completely determined by the Fourier power
spectrum, and therefore it should be effective for constraining the parameters contained in
the Fourier power spectrum.
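The logarithmic spacing of the bands can be made concrete with an explicit window. The sketch below assumes the Haar wavelet, for which $|\hat{\psi}(\nu)|^2 = \sin^4(\pi\nu/2)/(\pi\nu/2)^2$; the function name `haar_window` is illustrative, and the Haar window is considerably broader than that of smoother wavelets.

```python
import numpy as np

def haar_window(n, j):
    """W_j(n) = |psi_hat(n / 2**j)|**2 / 2**j for the Haar wavelet.

    For Haar, |psi_hat(v)|**2 = sin(pi v / 2)**4 / (pi v / 2)**2.
    """
    u = np.pi * np.asarray(n, dtype=float) / 2 ** (j + 1)
    return np.sin(u) ** 4 / u ** 2 / 2 ** j

n = np.arange(1, 2 ** 14)
# Wavenumber at which each band's window peaks.
peaks = {j: int(n[np.argmax(haar_window(n, j))]) for j in range(6, 10)}
ratios = [peaks[j + 1] / peaks[j] for j in range(6, 9)]
print(peaks)
print([round(r, 3) for r in ratios])
```

The peak wavenumber doubles from one scale to the next, i.e. the band centers are equally spaced in $\ln n$ with step $\ln 2$.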

The band-power spectrum (31) can alternatively be written as

$$P_j = \frac{1}{2^j}\,{\rm tr}\,{\rm Cov}^j_{l,l'}, \tag{38}$$

where the matrix ${\rm Cov}^j_{l,l'}$ is the j submatrix of the covariance, i.e.

$${\rm Cov}^j_{l,l'} = \langle\tilde{\epsilon}_{j,l}\tilde{\epsilon}_{j,l'}\rangle. \tag{39}$$

Therefore, the $P_j$'s exhaust all the information in the j-diagonals of the WFC covariance. Eq.(38)
shows that we actually need not diagonalize each j submatrix, as $P_j$ is given by the trace
of the j submatrix.

3.3. Scale-scale correlations in second and higher orders 

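In the range $j > J_{ss}$, the scale-scale correlations become significant, and the DWT
covariance will no longer be diagonal or j-diagonal.

In this scale range, some diagonalization of the DWT covariance would be needed.
However, the scale-scale correlation may lead to large errors in the diagonalization, or even
make the diagonalization impossible. Let us consider the example of the scale-scale
correlation given by eq.(25). In this case, the variable $\tilde{\epsilon}_{j+1,l}$ is linearly dependent
on $\tilde{\epsilon}_{j,l}$, and therefore the matrix $\langle\tilde{\epsilon}_{j+1,l}\tilde{\epsilon}_{j,l'}\rangle$ is singular. For
instance, for scales j = 1, 2, the covariance matrix is

$$\begin{pmatrix}
\langle\tilde{\epsilon}_{1,0}\tilde{\epsilon}_{1,0}\rangle & \langle\tilde{\epsilon}_{1,0}\tilde{\epsilon}_{2,0}\rangle & \langle\tilde{\epsilon}_{1,0}\tilde{\epsilon}_{2,1}\rangle \\
\langle\tilde{\epsilon}_{2,0}\tilde{\epsilon}_{1,0}\rangle & \langle\tilde{\epsilon}_{2,0}\tilde{\epsilon}_{2,0}\rangle & \langle\tilde{\epsilon}_{2,0}\tilde{\epsilon}_{2,1}\rangle \\
\langle\tilde{\epsilon}_{2,1}\tilde{\epsilon}_{1,0}\rangle & \langle\tilde{\epsilon}_{2,1}\tilde{\epsilon}_{2,0}\rangle & \langle\tilde{\epsilon}_{2,1}\tilde{\epsilon}_{2,1}\rangle
\end{pmatrix}
= \langle\tilde{\epsilon}^2_{1,0}\rangle
\begin{pmatrix}
1 & a & b \\
a & a^2 & ab \\
b & ab & b^2
\end{pmatrix}. \tag{40}$$

Obviously, this matrix is singular: every row is proportional to (1, a, b), so it has only one
non-zero eigenvalue and cannot be diagonalized into three independent modes.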
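The degeneracy of this 3×3 example is easy to verify numerically. The sketch below builds the matrix in units of $\langle\tilde{\epsilon}^2_{1,0}\rangle$ for arbitrary illustrative values of a and b and inspects its rank and eigenvalues.

```python
import numpy as np

a, b = 0.7, -1.3           # arbitrary illustrative constants
v = np.array([1.0, a, b])
cov = np.outer(v, v)       # the 3x3 example in units of <eps_{1,0}^2>

rank = np.linalg.matrix_rank(cov)
eigvals = np.linalg.eigvalsh(cov)      # eigenvalues in ascending order
det = np.linalg.det(cov)
print(rank, np.round(eigvals, 12))
```

Only one eigenvalue, equal to $1 + a^2 + b^2$, is non-zero: the three WFCs carry a single independent degree of freedom.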

More seriously, if the matrix elements carry uncorrelated measurement
errors, i.e. $\tilde{\epsilon}_{j,l} \pm \Delta\tilde{\epsilon}_{j,l}$, the matrix (40) looks diagonalizable. However, in
this case the minors of the matrix are given by the errors $\Delta\tilde{\epsilon}_{j,l}$, and therefore the
diagonalization will be largely contaminated by the errors.

This example indicates that when scale-scale correlations appear, the number of
independent variables, and hence the signal-to-noise ratio, will decrease. In that case we should not
extract the statistical properties of the covariance by a diagonalization.

Fortunately, our ultimate goal is not the mathematical diagonalization, but
discrimination among physical models of structure formation. An alternative to the full
diagonalization is to take the following two measures: (1) using the j-diagonals of each j
to calculate the band-power spectrum $P_j$ [eq.(31)]; (2) using the j off-diagonals to calculate
the second order scale-scale correlations. The second order scale-scale correlation is defined
as

$$C_{j,j'}(\Delta l) = \frac{1}{2^j}\sum_{l=0}^{2^j-1}\tilde{\epsilon}_{j,l}\tilde{\epsilon}_{j',l'}, \quad j > j', \tag{41}$$

$$l' = {\rm mod}[l/2^{j-j'}] + \Delta l.$$

Like the band-power spectrum [eqs.(30) and (31)], $C_{j,j'}(\Delta l)$ is defined by an ergodicity-allowed
average. $C_{j,j'}(\Delta l)$ measures the second order correlation between fluctuations
on scales j and j' at positions l and l'. Since the cosmic density field is homogeneous, the
correlation depends only on the difference between l and l', i.e. $\Delta lL/2^{j'}$. For an initially
Gaussian field, the scale-scale correlations develop during the non-linear evolution of
the gravitational clustering.
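A second order scale-scale correlation of this kind can be estimated directly from two arrays of WFCs. The sketch below is one possible reading of the definition: it assumes that each fine-scale position l is paired with the coarse-scale position given by the integer part of $l/2^{j-j'}$ shifted by $\Delta l$, with periodic wrapping. The function name `scale_scale_corr` and the test field are illustrative.

```python
import numpy as np

def scale_scale_corr(w_fine, w_coarse, dl=0):
    """Second-order scale-scale correlation between scales j > j'.

    w_fine has 2**j entries, w_coarse has 2**j' entries. Each fine-scale
    position l is paired with the coarse position l' = [l / 2**(j-j')] + dl,
    wrapped periodically (integer part and wrapping are assumptions).
    """
    w_fine = np.asarray(w_fine, dtype=float)
    w_coarse = np.asarray(w_coarse, dtype=float)
    ratio = w_fine.size // w_coarse.size
    l = np.arange(w_fine.size)
    lp = (l // ratio + dl) % w_coarse.size
    return np.mean(w_fine * w_coarse[lp])

rng = np.random.default_rng(4)
w_j = rng.standard_normal(1024)

# Non-local coupling: fine-scale WFCs copy the coarse ones 3 positions away.
a, b, shift = 0.8, 0.5, 3
w_j1 = np.empty(2048)
w_j1[0::2] = a * np.roll(w_j, -shift)
w_j1[1::2] = b * np.roll(w_j, -shift)

for dl in range(5):
    print(dl, round(scale_scale_corr(w_j1, w_j, dl), 3))
```

The estimate is consistent with zero except at the separation where the coupling was imposed, where it equals $(a+b)/2$ times the mean squared coarse-scale WFC.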

Now, we can use the two statistics $P_j$ and $C_{j,j'}$ to discriminate among models. Actually,
discrimination with these two statistics would be more valuable than the full diagonalization.
For instance, the model-predicted galaxy power spectra on smaller scales are generally
degenerate with respect to cosmological parameters, i.e. models with different cosmological
parameters can yield the same galaxy power spectrum. This is because one can always
choose the bias model parameters to fit the prediction to the observations. Therefore,
to remove the degeneracy, an independent measure for constraining the bias models is
necessary. The scale-scale correlation is found to be sensitive to the bias model (Feng,
Deng & Fang 2000). Thus, for model discrimination, the j-diagonal power spectrum plus
the scale-scale correlation would be more useful than a full diagonalization.

In short, in the scale range $j > J_{ss}$, we will extract the valid statistical information
from the covariance by $P_j$ and $C_{j,j'}(\Delta l)$.

It should be pointed out that even when all $C_{j,j'}(\Delta l)$ vanish, one cannot conclude that
the system is scale-scale uncorrelated. In other words, the fact that a decomposition $X_i$ yields a
diagonal covariance does not mean that the modes $X_i$ are really statistically uncorrelated.
There are many clustering models which have a diagonal covariance, but whose mode-mode
statistics are correlated at higher orders (Greiner, Lipa & Carruthers 1995). A diagonal
decomposition means only that the modes are uncorrelated at second order.

The higher order generalization of $C_{j,j'}(\Delta l)$ is straightforward. For instance, one can
measure the fourth order scale-scale correlations by

$$C^4_{j,j'}(\Delta l) = \frac{1}{2^j}\sum_{l=0}^{2^j-1}\tilde{\epsilon}^2_{j,l}\tilde{\epsilon}^2_{j',l'}, \quad j > j', \tag{42}$$

$$l' = {\rm mod}[l/2^{j-j'}] + \Delta l.$$

This correlation $C^4_{j,j'}(\Delta l = 0)$ is essentially the same as the so-called band-band correlation
defined by

T = , (43) 

It has been shown that the precision of the Fourier band-power spectrum estimator depends
on the band-band correlation T (Meiksin & White 1998). In the DWT representation, we
arrive at the similar conclusion that when $C_{j,j'}(\Delta l)$ or $C^4_{j,j'}(\Delta l)$ are non-zero, i.e. when the
DWT covariance is not j-diagonal, we should test models by both the band-power spectrum
and the scale-scale correlations. For samples of large scale structure, the scale-scale correlations
$C_{j,j'}(\Delta l = 0)$ have been found to be significant on scales less than about 10 $h^{-1}$ Mpc (Pando et al.
1998, Feng, Deng & Fang 2000).



4. The DWT algorithm of data binning 

In the following two sections, we discuss the algorithm for estimating the band
power spectrum $P_j$ and the scale-scale correlations $C_{jj'}(\Delta l)$ from galaxy redshift surveys and
other samples of large scale structure.

If the position measurement is perfectly precise, the observed galaxy distribution can
be written as

$$\rho^g(x) = \sum_{i=1}^{N_g} w_i \delta^D(x - x_i), \qquad (44)$$

where $N_g$ is the total number of galaxies, $x_i$ the position of the i-th galaxy ($0 < x_i < L$),
$w_i$ its weight, and $\delta^D$ the Dirac $\delta$ function. However, the position measurement has errors
due to finite spatial resolution, and therefore the distribution is usually given as
a binned histogram.

The binning is performed by a convolution of the data with a binning function $W(x)$, sampled on the grid, as

$$\rho^g_b(x) = \Pi(x) \int W(x - x') \rho^g(x')\, dx', \qquad (45)$$

in which $\rho^g_b(x)$ denotes the binned distribution and $\Pi(x)$ is the sampling function defined as $\Pi(x) = \sum_l \delta^D(x - lL/2^j)$, where
$l$ labels the $l$-th bin. Obviously, the mesh-defined density distribution is given by
$\rho^g_b(x) = \sum_l \rho^g_l \delta^D(x - lL/2^j)$, where $\rho^g_l = \int W(lL/2^j - x')\rho^g(x')dx'$ is the mass assignment at
the $l$-th bin.

It is well known that the binning eq.(45) results in spurious features in the Fourier
power spectrum on scales around the Nyquist frequency of the FFT grid (e.g. Jing 1992,
Percival & Walden 1993, Baugh & Efstathiou 1994). Mathematically, eq.(45) implies a
decomposition by the weight function $W(x)$. In other words, the $W(lL/2^j - x')$ play the
role of scaling functions (or sampling functions). If the scaling functions are not orthogonal
and complete, one cannot recover the original field without distortion. This may
cause spurious features, such as the aliasing effect in the FFT. In the DWT analysis,
the binning or sampling is always done by an orthogonal and complete decomposition, so one
can expect that the spurious features and false correlations are completely avoided.



4.1. Binning with wavelets

With eq.(6), one can directly calculate the WFCs of the galaxy distribution (44) by

$$\tilde\epsilon^g_{j,l} = \sum_{i=1}^{N_g} w_i \psi_{j,l}(x_i). \qquad (46)$$

The errors of $\tilde\epsilon^g_{j,l}$ can also be calculated from the errors of $x_i$.

The WFCs $\tilde\epsilon^g_{j,l}$ are assigned at regular grids $l = 0 ... 2^j - 1$. This is actually a binning of
the data. In this case, the binning is automatically realized by the orthogonal projection onto
wavelet space, and no extra weight function is required. As a result, the contamination due to
the sampling error is naturally eliminated.

Since we used the periodized distribution $\delta(x)$ in eq.(6), the discontinuity between the
data at the two boundaries may introduce false coefficients. Yet this possible false signal is
related only to the boundaries. One can expect that these false coefficients will not be important
for detecting the power spectrum on scales much less than $L$. This boundary effect has been
tested numerically by using simulated samples over a finite length divided into 512 bins with
two different boundary conditions: (A) periodic boundary conditions; (B) zero padding. The
results show that the spectrum can be correctly reconstructed by the DWT regardless of
the boundary conditions on scales equal to and less than 64 bins (Pando & Fang 1998).

Note has to be taken of the difference between the usual mass assignment and the DWT
projection (46). In the former, the mass assignment is given by partitioning the mass on the
grids according to the binning function $W(x)$, and the binned data are the mesh-defined
densities. For the DWT projection, in contrast, the binned data, i.e. the WFCs, are not
the mesh-defined densities but the fluctuations on scale $j$ at position $l$, which are obviously
not positive-definite.

4.2. Binning with scaling functions

In the DWT analysis, the mass assignment is realized by the scaling function $\phi_{j,l}(x)$
[eq.(A30)]. Besides the orthogonality eqs.(A33) and (A34), the basic scaling function $\phi(\eta)$
(which is not yet periodized!) satisfies the so-called "partition of unity" (Daubechies 1992)

$$\sum_{l=-\infty}^{\infty} \phi(\eta + l) = 1. \qquad (47)$$

One can also define the periodized scaling function as

$$\phi^P_{j,l}(x) = \left(\frac{2^j}{L}\right)^{1/2} \sum_{n=-\infty}^{\infty} \phi\left[2^j\left(\frac{x}{L} + n\right) - l\right]. \qquad (48)$$

Thus, eq.(47) can be rewritten as

$$\sum_{l=0}^{2^j-1} \left(\frac{L}{2^j}\right)^{1/2} \phi^P_{j,l}(x) = 1. \qquad (49)$$

We will only use the periodized scaling function below, and drop the superscript P.
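With the periodized scaling function, eqs.(A39) - (A41) give

$$\rho(x) = \rho^J(x) + \sum_{j=J}^{\infty} \sum_{l=0}^{2^j-1} \tilde\epsilon_{j,l} \psi_{j,l}(x), \qquad (50)$$

where

$$\rho^J(x) = \sum_{l=0}^{2^J-1} \epsilon_{J,l} \phi_{J,l}(x). \qquad (51)$$

The scaling function coefficients (SFCs) $\epsilon_{J,l}$ are given by

$$\epsilon_{J,l} = \int \rho(x) \phi_{J,l}(x)\, dx. \qquad (52)$$

Subjecting the distribution (44) to the transform eq.(50), we have

$$\rho^g(x) = \sum_{l=0}^{2^J-1} \epsilon^g_{J,l} \phi_{J,l}(x) + \sum_{j=J}^{\infty} \sum_{l=0}^{2^j-1} \tilde\epsilon^g_{j,l} \psi_{j,l}(x), \qquad (53)$$

where

$$\epsilon^g_{J,l} = \sum_{i=1}^{N_g} w_i \phi_{J,l}(x_i). \qquad (54)$$

Using eqs.(44) and (54), eq.(49) yields

$$\sum_{l=0}^{2^J-1} \left(\frac{L}{2^J}\right)^{1/2} \epsilon^g_{J,l} = \sum_{i=1}^{N_g} w_i. \qquad (55)$$

This shows that the i-th galaxy is assigned onto grid $l$ by the number $(L/2^J)^{1/2} w_i \phi_{J,l}(x_i)$.
Therefore, the SFC $(L/2^J)^{1/2} \epsilon^g_{J,l}$ is the mass assignment of $\rho^g(x)$.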
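As an illustration of binning by projection, the following minimal sketch computes SFCs and WFCs directly from galaxy positions as weighted sums of the basis functions evaluated at each galaxy. It assumes the orthonormal Haar basis (the paper does not prescribe a particular wavelet here), positions in $0 \le x < L$, and the function names `sfc` and `wfc` are hypothetical.

```python
import math

def haar_psi(eta):
    """Basic Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= eta < 0.5:
        return 1.0
    if 0.5 <= eta < 1.0:
        return -1.0
    return 0.0

def sfc(positions, weights, j, L):
    """SFCs: sum_i w_i phi_{j,l}(x_i) for the orthonormal Haar scaling function."""
    norm = math.sqrt(2**j / L)
    out = [0.0] * (2**j)
    for x, w in zip(positions, weights):
        l = int(2**j * x / L)        # Haar: exactly one bin l contains x
        out[l] += w * norm
    return out

def wfc(positions, weights, j, L):
    """WFCs: sum_i w_i psi_{j,l}(x_i) for the orthonormal Haar wavelet."""
    norm = math.sqrt(2**j / L)
    out = [0.0] * (2**j)
    for x, w in zip(positions, weights):
        eta = 2**j * x / L
        l = int(eta)
        out[l] += w * norm * haar_psi(eta - l)
    return out
```

With this basis, summing $(L/2^J)^{1/2}$ times the SFCs recovers the total weight, and each Haar WFC on scale $j$ equals the difference of the two neighbouring SFCs on scale $j+1$ divided by $\sqrt{2}$. The WFCs can take either sign, whereas the SFCs are non-negative for non-negative weights.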
4.3. The DWT binning and FFT

Given a galaxy distribution eq.(44), its Fourier transform is evaluated by the
trigonometric summation

$$\hat\rho^g(n) = \sum_{i=1}^{N_g} w_i e^{-i2\pi n x_i/L}, \qquad (56)$$

and the power spectrum is $|\hat\rho^g(n)|^2$. However, the power spectrum given by the FFT of
the binned distribution [eq.(45)], written here as $\hat\rho^g_b(n)$, is

$$|\hat\rho^g_b(n)|^2 = \sum_{n'=-\infty}^{\infty} |\hat W(n + 2^j n')|^2 |\hat\rho^g(n + 2^j n')|^2, \qquad (57)$$

where $\hat W(n)$ is the Fourier transform of the binning function $W(x)$. The power spectrum (57) is
obviously not equal to the power spectrum $|\hat\rho^g(n)|^2$; it is
a superposition of the power spectra $|\hat\rho^g(n + 2^j n')|^2$ on all scales $n + 2^j n'$. This is the
"aliasing" effect (Hockney & Eastwood 1989, Hoyle et al. 1999).

In the DWT representation, the FT of eq.(53) yields

$$\hat\rho^g(n) = \sum_{l=0}^{2^J-1} \epsilon^g_{J,l}\hat\phi_{J,l}(n) + \sum_{j=J}^{\infty}\sum_{l=0}^{2^j-1} \tilde\epsilon^g_{j,l}\hat\psi_{j,l}(n), \qquad (58)$$

where the function $\hat\phi_{j,l}(n)$ is the Fourier transform of $\phi_{j,l}(x)$, i.e.

$$\hat\phi_{j,l}(n) = \int_{-\infty}^{\infty} \phi_{j,l}(x) e^{-i2\pi nx/L}\, dx. \qquad (59)$$

Using the definition of $\phi_{j,l}(x)$ [eq.(A30)], eq.(59) becomes

$$\hat\phi_{j,l}(n) = \left(\frac{2^j}{L}\right)^{-1/2} \hat\phi(n/2^j) e^{-i2\pi nl/2^j}, \qquad (60)$$

where $\hat\phi(n)$ is the Fourier transform of the basic scaling function $\phi(\eta)$,

$$\hat\phi(n) = \int_{-\infty}^{\infty} \phi(\eta) e^{-i2\pi n\eta}\, d\eta. \qquad (61)$$

Eq.(58) then gives

$$\hat\rho^g(n) = \left(\frac{2^J}{L}\right)^{-1/2} \hat\phi(n/2^J) \sum_{l=0}^{2^J-1} \epsilon^g_{J,l} e^{-i2\pi nl/2^J} + \sum_{j=J}^{\infty} \left(\frac{2^j}{L}\right)^{-1/2} \hat\psi(n/2^j) \sum_{l=0}^{2^j-1} \tilde\epsilon^g_{j,l} e^{-i2\pi nl/2^j}. \qquad (62)$$

Since $\hat\psi(n/2^j)$ is localized in $n/2^j \sim n_p$, the second term on the r.h.s. of eq.(62) is
important only for $n > 2^J n_p$. Thus, the Fourier transform $\hat\rho^g(n)$ can be evaluated by

$$\hat\rho^g(n) = \hat\phi(n/2^J) F(n/2^J), \qquad n < 2^J n_p, \qquad (63)$$

where

$$F(n/2^J) = \left(\frac{2^J}{L}\right)^{-1/2} \sum_{l=0}^{2^J-1} \epsilon^g_{J,l} e^{-i2\pi nl/2^J}. \qquad (64)$$

$F$ can be calculated by the standard FFT technique. Therefore, the FT of the galaxy
distribution $\rho^g(x)$ can be evaluated directly by the FFT of its SFC mass assignment $\epsilon^g_{J,l}$.
Eqs.(63) and (64) are actually a scale-adaptive FFT for estimating the power spectrum of an
irregular data set. This algorithm computes $\hat\rho^g(n)$ up to the scales $n < 2^J n_p$, where the
adapted scale $J$ can be chosen as high as the scales to be studied.
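The scale-adaptive evaluation can be sketched numerically as follows. This is a minimal illustration under stated assumptions: it uses the Haar scaling function (whose Fourier transform has a closed form), a plain discrete Fourier sum in place of an FFT routine to stay dependency-free, and hypothetical function names. It compares the direct trigonometric summation over galaxies with the product of the scaling-function transform and the Fourier sum over SFCs.

```python
import cmath
import math

def haar_sfc(positions, weights, J, L):
    """SFC mass assignment on 2^J cells with the orthonormal Haar scaling function."""
    norm = math.sqrt(2**J / L)
    eps = [0.0] * (2**J)
    for x, w in zip(positions, weights):
        eps[int(2**J * x / L)] += w * norm
    return eps

def haar_phi_hat(k):
    """FT of the basic Haar scaling function: int_0^1 exp(-i 2 pi k eta) d eta."""
    if k == 0:
        return 1.0 + 0.0j
    return (1.0 - cmath.exp(-2j * math.pi * k)) / (2j * math.pi * k)

def rho_hat_direct(positions, weights, n, L):
    """Direct trigonometric summation over the galaxies."""
    return sum(w * cmath.exp(-2j * math.pi * n * x / L)
               for x, w in zip(positions, weights))

def rho_hat_from_sfc(eps, n, L):
    """phi_hat(n/2^J) * F(n/2^J), with F a discrete Fourier sum over the SFCs."""
    M = len(eps)
    F = math.sqrt(L / M) * sum(e * cmath.exp(-2j * math.pi * n * l / M)
                               for l, e in enumerate(eps))
    return haar_phi_hat(n / M) * F
```

For wavenumbers $n \ll 2^J$ the two evaluations agree up to a phase error of order $\pi n/2^J$ per galaxy; in practice the sum inside `rho_hat_from_sfc` would be replaced by a single FFT of the SFC array.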



5. The DWT algorithm on the Poisson sampling 

The observed or mock galaxy distributions $\rho^g(x)$ are considered to be a Poisson
sampling with an intensity $\rho^M(x) = \bar\rho(x)[1 + \delta(x)]$, where $\bar\rho(x)$ is the galaxy distribution if
galaxy clustering is absent, and is given by the selection function (Peebles 1980). A proper
power spectrum estimator should yield the power spectrum debiased from
the Poisson sampling. It has been realized that, to handle the Poisson sampling with a
non-uniform selection function, the decomposition basis $\psi_i(x)$ [eq.(1)] is required to have
zero average (e.g. Tegmark et al. 1998), i.e.

$$\int \psi_i(x)\, dx = 0. \qquad (65)$$

Here the DWT analysis has an advantage: for the wavelets $\psi_{j,l}(x)$,
eq.(65) always holds due to the admissibility [eq.(7)].






5.1. Algorithm for the DWT covariance affected by Poisson sampling 



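Considering the Poisson sampling, the characteristic function of the galaxy distribution
$\rho^g(x)$ is

$$Z[e^{i\int \rho^g(x)u(x)dx}] = \exp\left\{\int dx\, \rho^M(x)[e^{iu(x)} - 1]\right\}, \qquad (66)$$

and the correlation functions of $\rho^g(x)$ are given by

$$\langle \rho^g(x_1) ... \rho^g(x_n) \rangle_P = \frac{1}{i^n} \left[\frac{\delta^n Z}{\delta u(x_1) ... \delta u(x_n)}\right]_{u=0}, \qquad (67)$$

where $\langle ... \rangle_P$ is the average for the Poisson sampling. We have then

$$\langle \rho^g(x) \rangle_P = \rho^M(x), \qquad (68)$$

and

$$\langle \rho^g(x)\rho^g(x') \rangle_P = \rho^M(x)\rho^M(x') + \delta^D(x - x')\rho^M(x). \qquad (69)$$

This equation yields

$$\langle \delta(x)\delta(x') \rangle = -1 + \frac{\langle\langle \rho^g(x)\rho^g(x') \rangle_P\rangle}{\bar\rho(x)\bar\rho(x')} - \frac{\delta^D(x - x')}{\bar\rho(x)}. \qquad (70)$$

Since $\bar\rho(x)$ is not subject to a Poisson process, the second term of the r.h.s. of eq.(70) can
be rewritten as $\langle\langle [\rho^g(x)/\bar\rho(x)][\rho^g(x')/\bar\rho(x')] \rangle_P\rangle$. Using eq.(44), we have

$$\frac{\rho^g(x)}{\bar\rho(x)} = \sum_{i=1}^{N_g} \frac{1}{\bar\rho(x_i)} w_i \delta^D(x - x_i), \qquad (71)$$

in which the factor $1/\bar\rho(x_i)$ can be absorbed into the weight factors $w_i$. The WFC covariance
is given by

$$\langle \tilde\epsilon_{j,l}\tilde\epsilon_{j',l'} \rangle = \langle\langle \tilde\epsilon^g_{j,l}\tilde\epsilon^g_{j',l'} \rangle_P\rangle - \int \frac{\psi_{j,l}(x)\psi_{j',l'}(x)}{\bar\rho(x)}\, dx. \qquad (72)$$

The first term in the r.h.s. of eq.(70) disappears because all the basis functions $\psi_{j,l}(x)$ are admissible
[eq.(7)].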
5.2. The estimators for the DWT band power spectrum

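If the selection function varies slowly on a scale $j$, i.e.

$$\left|\frac{d\ln\bar\rho(x)}{dx}\right| < \frac{2^j}{L}, \qquad (73)$$

we have approximately

$$\int \frac{\psi_{j,l}(x)\psi_{j',l'}(x)}{\bar\rho(x)}\, dx \simeq \frac{1}{\bar\rho(x_l)} \delta_{j,j'}\delta_{l,l'}, \qquad (74)$$

where $\bar\rho(x_l)$ is the number density of galaxies averaged over a volume of $L/2^j$ at $l$. In this
case, the band-power spectrum simplifies to

$$P_j = \frac{1}{2^j}\sum_{l=0}^{2^j-1} \langle\langle (\tilde\epsilon^g_{j,l})^2 \rangle_P\rangle - \frac{1}{2^j}\sum_{l=0}^{2^j-1} \frac{1}{\bar\rho(x_l)}. \qquad (75)$$

The second term in the r.h.s. is the variance from the Poisson process. Since the Poisson
process does not change the ergodicity, the average over $l$ in eq.(75) is already a fair
estimate of the ensemble average. Therefore, one can drop $\langle\langle ... \rangle_P\rangle$ in eq.(75), and the
estimator of the DWT band power spectrum is given by

$$P_j = \frac{1}{2^j}\sum_{l=0}^{2^j-1} (\tilde\epsilon^g_{j,l})^2 - \frac{1}{2^j}\sum_{l=0}^{2^j-1} \frac{1}{\bar\rho(x_l)}. \qquad (76)$$

The second term subtracts the contribution of the discreteness effect (or shot noise)
of the Poisson sampling from the power spectrum. $P_j$ is thus debiased from the Poisson process.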
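The shot-noise subtraction can be illustrated with a small numerical sketch. It assumes a constant selection function $\bar\rho$, unit weights, the orthonormal Haar wavelet, and a band power defined as the average of the squared WFCs over positions $l$; the function names are hypothetical. For an unclustered Poisson sample the raw band power should be close to $1/\bar\rho$ and the debiased one close to zero.

```python
import math
import random

def haar_wfc_of_ratio(positions, j, L, rho_bar):
    """Haar WFCs of rho^g(x)/rho_bar for unit-weight galaxies, constant rho_bar."""
    norm = math.sqrt(2**j / L)
    out = [0.0] * (2**j)
    for x in positions:
        eta = 2**j * x / L
        l = int(eta)
        sign = 1.0 if eta - l < 0.5 else -1.0   # Haar wavelet: +1 left half, -1 right half
        out[l] += sign * norm / rho_bar
    return out

def band_power(wfcs, rho_bar):
    """Return (raw, debiased) band power; the shot-noise term is 1/rho_bar here."""
    raw = sum(e * e for e in wfcs) / len(wfcs)
    shot = 1.0 / rho_bar
    return raw, raw - shot

def poisson_demo(n_gal=20000, j=7, L=1.0, seed=1):
    """Unclustered sample: positions drawn uniformly in [0, L)."""
    rng = random.Random(seed)
    xs = [L * rng.random() for _ in range(n_gal)]
    rho_bar = n_gal / L
    return band_power(haar_wfc_of_ratio(xs, j, L, rho_bar), rho_bar), rho_bar
```

For a non-uniform selection function, the constant `shot` would be replaced by the average of $1/\bar\rho(x_l)$ over the cells at scale $j$, provided the slow-variation condition holds.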



5.3. The estimators for the scale-scale correlations

Similarly, one can calculate the debiased scale-scale correlations from a galaxy sample
$\rho^g(x)$. From eq.(70), the term of the Poisson process is free from scale-scale correlation, so
the second order scale-scale correlation can be calculated from the WFCs of the galaxy
distribution without the correction for the shot noise

j 2*'-l 

CjA^) = ^E < : '/' r , ■ .±r J > ?■ (77) 
However, the Poisson process is not free from higher order scale-scale correlations. For
instance, to estimate the band-band correlations eq.(42), we use eq.(67) with $n = 4$. It gives

V-i 



c 



3,3 



1 

2^ 



e 

1=0 

fey 



(78) 



M x )^j'A x ) dx f ^3A x ')^3',i'( x ') dx , 

p(x') 

2 j -l „ „/,2 



where $j > j'$ and $l' = \mathrm{mod}[l/2^{j-j'}] + \Delta l$. The last three terms are the scale-scale correlations
from the Poisson sampling. Strictly, the factor $\bar\rho(x)$ in the Poisson terms should be
$\rho^M(x) = \bar\rho(x)[1 + \delta(x)]$, but we ignore the contributions of $\delta(x)$ at the moment.

If the selection function is slowly varying on scales $j$ and $j'$ [eq.(73)], we have



r 2 - 



23-1 



E(^ 2 ^ 1 



c 3';i' 



(79) 



1=0 
23-1 



1 



y - 

p{x{)p(x v 



23-1 

E 

1=0 



p 3 (x) 



dx 



The second and third terms correct for the shot noise at fourth order. Numerical results
show that for typical galaxy survey samples the local ($l' = l$) scale-scale correlation of
the Poisson sampling is significant on small scales (Feng, Deng & Fang 2000).



6. Discussions and conclusions 

We presented a method of extracting the band-power spectrum from observed data
and simulation samples via a DWT multiresolution decomposition. The DWT scale-by-scale
approach provides physical insight into the covariance matrix of the cosmic mass field.

A key indicator for the DWT power spectrum estimator is the scale-scale and/or the
band-band correlations, which can be calculated directly from the DWT covariance and
the WFCs. In the scale range where the scale-scale correlations are negligible, the DWT
covariance is j(scale)-diagonal, and it is already a lossless estimate of a banded power
spectrum $P_j$. This DWT band power spectrum is optimized in the sense that the spatial
resolution adapts automatically to the scales of the density perturbations.

In the scale range where the scale-scale (or band-band) correlations are significant, the
diagonalization of the covariance may not yield an accurate power spectrum, but one seriously
contaminated by errors. In this case, an effective confrontation between the observed sample
and model predictions may not be given by a fully diagonalized covariance, but by both the
DWT power spectrum and the scale-scale correlations. With the DWT representation, one can
calculate the scale-scale correlation as well as the DWT power spectrum. Therefore, the
DWT covariance is also useful when the scale-scale correlation is strong.

In summary, the basic DWT algorithm proceeds in the following steps:

1. Calculate the WFCs $\tilde\epsilon^g_{J,l}$ and/or the SFCs $\epsilon^g_{J,l}$ from the data $\rho^g(x)$, where $J$
corresponds to the highest resolution of the samples.

2. Calculate the WFCs for various scales $j$.

3. Calculate the band-power spectrum $P_j$ and the scale-scale correlations $C_{jj'}$.

4. In the $j$ range where $C_{jj'} \simeq 0$, test models or constrain parameters by comparing
the model-predicted DWT band-power spectrum $P_j$ with the observed results.

5. In the $j$ range where $C_{jj'} \neq 0$, test models or constrain parameters by comparing
the model-predicted DWT band-power spectrum and scale-scale correlations with
the observed results.
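Steps 1 to 3 can be sketched for the simplest case as follows. This is a minimal illustration, assuming the orthonormal Haar basis and a band power defined as the average of the squared WFCs at each scale, without the shot-noise term; the function names are hypothetical. Starting from the $2^J$ finest-scale SFCs, each pass produces the WFCs of the next coarser scale and the smoothed SFCs passed on to the following pass.

```python
import math

def haar_pyramid(sfc_finest):
    """From 2^J SFCs return (eps_0, {j: WFCs at scale j}) for j = J-1, ..., 0."""
    n = len(sfc_finest)
    if n < 1 or n & (n - 1):
        raise ValueError("number of SFCs must be a power of two")
    eps = list(sfc_finest)
    j = n.bit_length() - 1
    wfcs = {}
    r = math.sqrt(2.0)
    while j > 0:
        j -= 1
        # differences of neighbouring cells give the WFCs at scale j
        wfcs[j] = [(eps[2 * l] - eps[2 * l + 1]) / r for l in range(2**j)]
        # sums of neighbouring cells give the SFCs at scale j
        eps = [(eps[2 * l] + eps[2 * l + 1]) / r for l in range(2**j)]
    return eps[0], wfcs

def band_power_spectrum(wfcs):
    """Average of squared WFCs over positions l, for every scale j."""
    return {j: sum(e * e for e in c) / len(c) for j, c in wfcs.items()}
```

The total operation count is proportional to the number of finest-scale cells, and the sum of squares of the inputs equals the square of the coarsest SFC plus the sum of squares of all WFCs, since the transform is orthonormal.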

Since the DWT is computationally powerful, the above-mentioned algorithm is found
to be numerically efficient and flexible (Yang et al. 2000). Moreover, the developed
method is open in the sense that, based on the WFCs and SFCs, one can add subsequent
items to realize further goals related to the power spectrum measurement and model
discrimination. Some of these problems are discussed below.

6.1. Higher dimensions and complex geometry 

The DWT analysis in a 2-D and/or 3-D space $\mathbf{x}$ can be performed with bases given by the
direct product of 1-D bases, i.e.

$$\psi_{(j_1,j_2,j_3),(l_1,l_2,l_3)}(x_1, x_2, x_3) = \psi_{j_1,l_1}(x_1)\psi_{j_2,l_2}(x_2)\psi_{j_3,l_3}(x_3). \qquad (80)$$

In this case, the three scales $(j_1, j_2, j_3)$ of the WFCs can be different for different directions.
One can define radial scales by

*-[e) ,+ o ,+ (£)r- 

where $L_1 \times L_2 \times L_3$ is the 3-D box.

For 2-D and 3-D samples, one can also decompose by the mixed direct product of 1-D
wavelets and scaling functions. For instance, a 3-D sample can be decomposed by the bases

$$\phi_{j_1,l_1}(x_1)\psi_{j_2,l_2}(x_2)\psi_{j_3,l_3}(x_3), \qquad (82)$$

where the scaling functions $\phi_{j_1,l_1}$ actually play the role of chopping a 3-D sample into $2^{j_1}$ 2-D
slices in the $x_1$ direction, $l_1 = 0, ... 2^{j_1} - 1$. Like the binning by the scaling function (§4.2),
the chopping eq.(82) will not cause spurious features.

The problem of complex geometry of samples can be treated by using the locality of
the $\psi_{j,l}$ (Pando & Fang 1998a). The locality property allows the WFCs to be independent
of the data outside an "influence" cone. The WFC $\tilde\epsilon_{j,l}$ is determined only by data in the
interval [(IL/V+ 1 - (Ax)/2^ +1 , (IL/T +1 + (Arr)/2^ +1 ], where Ax is the width of the basic 
wavelet $\psi$. With this property, any complex geometry of samples can be regularized into a
2-D or 3-D box by zero padding the field between the sample geometry and the box. Since
all WFCs in the zero padding zone are zero, one can use the DWT to analyze the regular
box, but should not treat the WFCs related to the zero padding as variables with valid degrees of
freedom.$^4$

6.2. Non-Gaussianity and power spectrum detection 

We have emphasized that information on non-Gaussian features is important
for a precise detection of the power spectrum, or band power spectrum. This is because,
from the covariance, one can only find bases or modes that are statistically uncorrelated (or statistically orthogonal)
on second order. For non-Gaussian fields, modes that are statistically
uncorrelated on second order might be statistically correlated at the 3rd and 4th orders. On
the other hand, the power spectrum is of second order, and therefore the power spectrum
estimates at different scales might not be statistically uncorrelated if there are 3rd and 4th
order correlations. The accuracy of a power spectrum estimate is thus affected by the higher
order statistical correlations.

For instance, a popular bias model for galaxy formation employs the selection probability
function (Cole et al. 1998)

$$P(\delta(r)) \propto \exp\left[\alpha \frac{\delta_s(r)}{\sigma_s}\right], \qquad (83)$$

where $\alpha$ is a constant, and $\delta_s(r)$ and $\sigma_s$ are the smoothed density field and its variance. Therefore, if the
density field is Gaussian, the galaxy distribution given by the Poisson sampling with the
intensity eq.(83) will be lognormal. The baryonic distribution is sometimes also modeled
by a lognormal relation with the underlying Gaussian mass field (Bi, Ge & Fang 1995, Bi
& Davidsen 1997). As is well known, for a lognormal distribution the most likely
value can be significantly different from the mean value. In this case, to estimate the
accuracy of a power spectrum detection, the higher order cumulant statistics are needed.

$^4$ About DWT on manifolds, see also W. Sweldens, http://www.wavelet.org or
http://cm.bell-labs.com/who/wim

In the DWT analysis, the $2^j$ WFCs give the one point distribution of the fluctuations
on scale $j$. Therefore, the third and fourth cumulants can be calculated by

$$S_j = \frac{1}{2^j P_j^{3/2}} \sum_{l=0}^{2^j-1} (\tilde\epsilon_{j,l})^3, \qquad (84)$$

$$K_j = \frac{1}{2^j P_j^{2}} \sum_{l=0}^{2^j-1} (\tilde\epsilon_{j,l})^4 - 3. \qquad (85)$$

These are, respectively, the skewness and kurtosis spectra. It is not difficult to generalize
eqs.(84) and (85) to higher orders.
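A skewness and kurtosis spectrum of this kind can be sketched in a few lines. The normalization below is an assumption: it uses raw third and fourth moments of the WFCs at one scale, divided by powers of the mean squared WFC (which plays the role of $P_j$ when the WFCs have zero mean and shot noise is ignored). The function name is hypothetical.

```python
def skewness_kurtosis(wfcs):
    """Skewness and kurtosis of the WFCs at a single scale j.

    Assumes zero-mean WFCs, so raw moments are used and the second
    moment serves as the band power P_j.
    """
    n = len(wfcs)
    P = sum(e**2 for e in wfcs) / n
    S = sum(e**3 for e in wfcs) / (n * P**1.5)
    K = sum(e**4 for e in wfcs) / (n * P**2) - 3.0
    return S, K
```

Applying the function to the list of WFCs at each scale $j$ in turn gives the two spectra; for a Gaussian one point distribution both values scatter around zero.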



6.3. Selection of the basis of the multiresolution analysis 

In analyzing samples of redshift surveys, two coordinate systems have
been widely used: 1. the parallel plane system; 2. the spherical shell system. For system 1, the
volume of the survey can be approximated as a box, and therefore the wavelets of eqs.(80)
and (82) are suitable for the decomposition. For system 2, we should use wavelets
on the 2-D spherical surface. With the development of the DWT analysis, more and more
sets of orthogonal and complete bases have become available for the
multiresolution decomposition of different geometries. The multiscale analysis of geometries
beyond the above-mentioned two simple cases is becoming feasible.



6.4. Systematic effects 



The influence of various systematic effects on the power spectrum detection has so far
been studied only preliminarily. The linear effect of redshift distortion on the power
spectrum detection has been well studied (e.g. Hamilton 1995). It is not difficult to
incorporate the linear theory of the redshift distortion into the DWT analysis. A key
operator for mapping a real space distribution into redshift space is $(1 - \alpha(\partial^2/\partial z^2)\nabla^{-2})$,
where the coefficient $\alpha$ is a constant. To diagonalize this differential-integral operator, the Fourier
representation is certainly the best. However, it has been shown that this operator is
quasi-diagonal in the DWT representation (Farge 1996).

Moreover, it would be straightforward to include a scale-dependent bias in the DWT
representation. The redshift distortion is usually calculated under the assumption that
the galaxy distribution $\rho^g(x)$ is linearly related to the underlying mass field $\rho(x)$, i.e.
$\rho^g(r) = b\rho(r)$, where $b$ is the bias parameter. However, observations have indicated that
the bias parameters are probably scale-dependent (Fang, Deng & Xia 1998). For instance, one can define a
bias parameter $b_j$ on scale $j$ by $\tilde\epsilon^g_{j,l} = b_j \tilde\epsilon_{j,l}$.

LLF acknowledges support from the National Science Foundation of China (NSFC)
and a World Laboratory Scholarship. This project was done during LLF's visit to the
Department of Physics, University of Arizona. This work was supported in part by the
LWL foundation. We thank the anonymous referee for helpful comments.
A. The discrete wavelet transform (DWT) of density fields

Let us briefly introduce the DWT analysis of cosmic mass density fields. For the
mathematical details, we refer to the classical papers by Mallat (1989a,b,c), Meyer
(1992), Daubechies (1992) and references therein; for physical applications, we refer
to Fang & Thews (1998) and references therein. Some other cosmological applications of
wavelets can be found in, e.g., Pando, Valls-Gabaud & Fang (1998), Hobson, Jones &
Lasenby (1999), Sanz et al. (1999), Tenorio et al. (1999), Xu, Fang & Wu (2000), and Cayon
et al. (2000).

We consider here a 1-D mass density distribution $\rho(x)$ or contrast $\delta(x) = [\rho(x) - \bar\rho]/\bar\rho$,
which are mathematically random fields over a spatial range $0 < x < L$. It is not difficult
to extend all results developed in this section to 2-D and 3-D because the DWT bases for
higher dimensions can be constructed by a direct product of 1-D bases.

A.1. Expansion by scaling functions

First, we introduce the scaling functions for the Haar wavelets. These are top-hat
window functions defined by

$$\phi^H_{j,l}(x) = \begin{cases} 1 & \text{for } Ll2^{-j} < x < L(l+1)2^{-j} \\ 0 & \text{otherwise,} \end{cases} \qquad (A1)$$

where the superscript H stands for Haar. The scaling function $\phi^H_{j,l}(x)$ actually
gives a window at resolution scale $L/2^j$ and position $Ll2^{-j} < x < L(l+1)2^{-j}$. With
the scaling function, the mean of the density contrast distribution in the spatial range
$Ll2^{-j} < x < L(l+1)2^{-j}$ can be expressed as

$$\epsilon_{j,l} = \frac{2^j}{L} \int_0^L \delta(x)\phi^H_{j,l}(x)\, dx. \qquad (A2)$$

The number $\epsilon_{j,l}$ is called the scaling function coefficient (SFC). Using SFCs, one can
construct a density contrast field as

$$\delta^j(x) = \sum_{l=0}^{2^j-1} \epsilon_{j,l}\phi^H_{j,l}(x). \qquad (A3)$$

This is the density contrast $\delta(x)$ smoothed on scale $L/2^j$, or simply the $j$-scale.
The scaling function $\phi^H_{j,l}(x)$ can be rewritten as

$$\phi^H_{j,l}(x) = \phi^H(2^j x/L - l), \qquad (A4)$$

where

$$\phi^H(\eta) = \begin{cases} 1 & \text{for } 0 < \eta < 1 \\ 0 & \text{otherwise.} \end{cases} \qquad (A5)$$

Here $j$, $l$ are integers, with $j \ge 0$ and $0 \le l \le 2^j - 1$. $\phi^H(\eta)$ is called the basic scaling function.
The scaling function $\phi^H_{j,l}(x)$ is thus a translation and dilation of the basic scaling function.

The functions $\phi^H_{j,l}(x)$ are orthogonal with respect to $l$, i.e.

$$\int_0^L \phi^H_{j,l}(x)\phi^H_{j,l'}(x)\, dx = \frac{L}{2^j}\delta_{l,l'}, \qquad (A6)$$

where $\delta_{l,l'}$ is the Kronecker delta function. Thus, eq.(A3) gives functions in the function space
$V_j$ spanned by the bases $\phi^H_{j,l}(x)$. $V_j$ is a closed subspace of $L^2(R)$, i.e. $V_j \subset L^2(R)$. It is easy to
show that

$$\phi^H_{j,l}(x) = \phi^H_{j+1,2l}(x) + \phi^H_{j+1,2l+1}(x), \qquad (A7)$$

$$\epsilon_{j,l} = \frac{1}{2}(\epsilon_{j+1,2l} + \epsilon_{j+1,2l+1}). \qquad (A8)$$

Therefore, $V_j \subset V_{j+1}$ for all $j$. Thus, the orthogonal projectors $P_j$ onto $V_j$, i.e. $P_j f \in V_j$,
satisfy

$$\lim_{j\to\infty} P_j f = f, \qquad (A9)$$

for all $f \in L^2(R)$. A multiresolution analysis is then defined by the sequence of subspaces






A.2. Expansion by wavelets 

Eqs.(A7) and (A8) show that $\delta^j(x)$ contains less information than $\delta^{j+1}(x)$, because
information on scale $j + 1$ has been smoothed out by eq.(A8). It would be nice not to lose
any information during the smoothing from $j + 1$ to $j$ [eq.(A8)]. This can be accomplished
if the differences, $\delta^{j+1}(x) - \delta^j(x)$, between the smoothed distributions on succeeding scales
are retained. That is, if we are able to retain these differences, this scheme
makes it possible to smooth the distribution and yet not lose any information as a
result of the smoothing.

To calculate the differences, we define the difference function, or wavelet, as

$$\psi^H(\eta) = \begin{cases} 1 & \text{for } 0 < \eta < 1/2 \\ -1 & \text{for } 1/2 < \eta < 1 \\ 0 & \text{otherwise.} \end{cases} \qquad (A10)$$

This is the basic Haar wavelet. As with the scaling functions, one can construct a set of
wavelets $\psi^H_{j,l}(x)$ by dilating and translating eq.(A10) as

$$\psi^H_{j,l}(x) = \psi^H(2^j x/L - l). \qquad (A11)$$

The Haar wavelets are orthogonal with respect to both indexes $j$ and $l$, i.e.

$$\int_0^L \psi^H_{j',l'}(x)\psi^H_{j,l}(x)\, dx = \frac{L}{2^j}\delta_{j',j}\delta_{l',l}. \qquad (A12)$$

For a given $j$, $\psi^H_{j,l}(x)$ is also orthogonal to the scaling functions $\phi^H_{j',l'}(x)$ with $j' \le j$, i.e.

$$\int_0^L \phi^H_{j',l'}(x)\psi^H_{j,l}(x)\, dx = 0, \quad \text{if } j' \le j. \qquad (A13)$$

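From eqs.(A4) and (A11), we have

$$\phi^H_{j+1,2l}(x) = \frac{1}{2}[\phi^H_{j,l}(x) + \psi^H_{j,l}(x)], \qquad \phi^H_{j+1,2l+1}(x) = \frac{1}{2}[\phi^H_{j,l}(x) - \psi^H_{j,l}(x)]. \qquad (A14)$$

Thus, the difference $\delta^{j+1}(x) - \delta^j(x)$ is given by

$$\delta^{j+1}(x) - \delta^j(x) = \sum_{l=0}^{2^j-1} \tilde\epsilon_{j,l}\psi^H_{j,l}(x), \qquad (A15)$$

where the $\tilde\epsilon_{j,l}$ are called the wavelet function coefficients (WFCs), which are given by

$$\tilde\epsilon_{j,l} = \frac{2^j}{L}\int_0^L \delta(x)\psi^H_{j,l}(x)\, dx. \qquad (A16)$$

Using the relation (A15) repeatedly, we have

$$\delta^j(x) = \delta^0(x) + \sum_{j'=0}^{j-1}\sum_{l=0}^{2^{j'}-1} \tilde\epsilon_{j',l}\psi^H_{j',l}(x). \qquad (A17)$$

This is an expansion of the function $\delta^j(x)$ with respect to the basis $\psi^H_{j',l}(x)$, and $\delta^0(x)$ is the
mean of $\delta(x)$ in the range $L$. We have $\delta^0(x) = 0$ if $\delta(x)$ is a density contrast. Considering
(A9), for any $f(x) \in L^2(R)$ in $L$ with mean $\bar f = 0$ we have

$$f(x) = \sum_{j=0}^{\infty}\sum_{l=0}^{2^j-1} \tilde\epsilon_{j,l}\psi^H_{j,l}(x), \qquad (A18)$$

and

$$\tilde\epsilon_{j,l} = \frac{2^j}{L}\int_0^L f(x)\psi^H_{j,l}(x)\, dx. \qquad (A19)$$

For a given $j$, the wavelets $\psi^H_{j,l}(x)$ span a space $W_j$ which is the orthogonal complement
of $V_j$ in $V_{j+1}$, i.e. $V_{j+1} = V_j \oplus W_j$. Thus, every $f^j \in V_j$ has a unique decomposition
$f^j = f^{j-1} + d^{j-1}$ with $f^{j-1} \in V_{j-1}$ and $d^{j-1} \in W_{j-1}$. Since $W_j \subset V_{j+1}$ and $W_j$ is orthogonal
to $V_j$, $W_j$ is also orthogonal to $W_{j-1}$ and $V_{j-1}$. Thus, all the spaces $W_j$ are mutually
orthogonal. Since $V_j$ contains only $W_{j'}$ with $j' < j$, $V_j$ is orthogonal to all $W_{j'}$ with $j' \ge j$.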
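The idea of smoothing while retaining the differences can be illustrated by a minimal sketch in the unnormalized Haar convention of this subsection, where the coarser SFC is the mean of two neighbouring finer SFCs and the WFC is half their difference. The function names are hypothetical.

```python
def smooth_and_difference(eps_fine):
    """One Haar step: SFCs at scale j (means) and WFCs at scale j (half differences)."""
    half = len(eps_fine) // 2
    means = [(eps_fine[2 * l] + eps_fine[2 * l + 1]) / 2 for l in range(half)]
    diffs = [(eps_fine[2 * l] - eps_fine[2 * l + 1]) / 2 for l in range(half)]
    return means, diffs

def restore(means, diffs):
    """Invert the step: recover the SFCs at scale j+1 from means and differences."""
    fine = []
    for m, d in zip(means, diffs):
        fine += [m + d, m - d]
    return fine
```

Because `restore` recovers the finer SFCs exactly, no information is lost by the smoothing as long as the differences are kept.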



A.3. Compactly supported orthogonal basis

In terms of the subspaces $V_j$, the basic scaling function $\phi(\eta)$ and basic wavelet $\psi(\eta)$ belong to
$V_0$ and $W_0$ respectively, and they can be expressed by the basis of $V_1$, $\phi(2\eta - l)$, i.e.

$$\phi(\eta) = \sum_{l=-\infty}^{\infty} a_l \phi(2\eta - l), \qquad (A20)$$

$$\psi(\eta) = \sum_{l=-\infty}^{\infty} b_l \phi(2\eta - l), \qquad (A21)$$

where $a_l$ and $b_l$ are called the filter coefficients.

If we require that the scaling function $\phi(\eta)$ is normalized, eq.(A20) yields

$$\sum_l a_l = 2. \qquad (A22)$$

Requiring orthogonality for $\phi(\eta)$ with respect to discrete integer translations, i.e.

$$\int_{-\infty}^{\infty} \phi(\eta - m)\phi(\eta)\, d\eta = \delta_{m,0}, \qquad (A23)$$

we have

$$\sum_l a_l a_{l+2m} = 2\delta_{0,m}. \qquad (A24)$$

The orthogonality between $\phi$ and $\psi$ means

$$\int_{-\infty}^{\infty} \psi(\eta)\phi(\eta - l)\, d\eta = 0. \qquad (A25)$$

Therefore, one has

$$b_l = (-1)^l a_{1-l}. \qquad (A26)$$

Furthermore, the wavelet $\psi(\eta)$ has to be admissible

$$\int_{-\infty}^{+\infty} \psi(\eta)\, d\eta = 0, \qquad (A27)$$

so we need

$$\sum_l b_l = 0. \qquad (A28)$$

The conditions (A22), (A24), (A26) and (A28) for the filter coefficients were employed
to construct families of scaling functions and wavelets. The simplest solution for the filter
coefficients is $a_0 = a_1 = b_0 = -b_1 = 1$ with all others 0. This solution gives the Haar wavelet.
After the Haar wavelet, the simplest solution for the filter coefficients is

$$a_0 = (1 + \sqrt{3})/4, \quad a_1 = (3 + \sqrt{3})/4, \quad a_2 = (3 - \sqrt{3})/4, \quad a_3 = (1 - \sqrt{3})/4. \qquad (A29)$$

This is the Daubechies 4 wavelet (D4). It is compactly supported and continuous.
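The filter conditions can be checked numerically. The sketch below is an illustration with hypothetical function names: it builds the D4 coefficients, derives the wavelet filter as $b_l = (-1)^l a_{1-l}$, and tests the normalization, the orthogonality under even shifts, and the admissibility condition.

```python
import math

def d4_filters():
    """D4 scaling filter a_l (l = 0..3) and wavelet filter b_l = (-1)^l a_{1-l}."""
    s3 = math.sqrt(3.0)
    a = {0: (1 + s3) / 4, 1: (3 + s3) / 4, 2: (3 - s3) / 4, 3: (1 - s3) / 4}
    b = {l: (1 if l % 2 == 0 else -1) * a[1 - l] for l in range(-2, 2)}
    return a, b

def check_conditions(a, b, tol=1e-12):
    """Return (normalized, orthogonal, admissible) as booleans."""
    normalized = abs(sum(a.values()) - 2.0) < tol
    orthogonal = all(
        abs(sum(a[l] * a.get(l + 2 * m, 0.0) for l in a) - (2.0 if m == 0 else 0.0)) < tol
        for m in range(-2, 3)
    )
    admissible = abs(sum(b.values())) < tol
    return normalized, orthogonal, admissible
```

The same check passes for the Haar solution, `a = {0: 1.0, 1: 1.0}` and `b = {0: 1.0, 1: -1.0}`.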

With these wavelets, the multiresolution analysis can be performed in a similar way
to that developed in the last two sections for the Haar wavelets. The scaling functions and wavelets
spanning the subspaces $V_j$ and $W_j$ are given, respectively, by a translation and dilation
of the basic scaling function and basic wavelet

$$\phi_{j,l}(x) = \left(\frac{2^j}{L}\right)^{1/2} \phi(2^j x/L - l) \qquad (A30)$$

and

$$\psi_{j,l}(x) = \left(\frac{2^j}{L}\right)^{1/2} \psi(2^j x/L - l). \qquad (A31)$$

The wavelets are orthonormal, i.e.

$$\int \psi_{j,l}(x)\psi_{j',l'}(x)\, dx = \delta_{j,j'}\delta_{l,l'}. \qquad (A32)$$

Eqs.(A23) and (A25) yield also

$$\int \phi_{j,l}(x)\phi_{j,l'}(x)\, dx = \delta_{l,l'}, \qquad (A33)$$

and

$$\int \phi_{j,l}(x)\psi_{j',l'}(x)\, dx = 0, \quad \text{if } j' \ge j. \qquad (A34)$$

The set of $\psi_{j,l}(x)$ and $\phi_{0,m}(x)$ with $0 \le j < \infty$ and $-\infty < l, m < \infty$ forms a complete,
orthonormal basis in the space of functions with period length $L$.

Thus, a density field $\rho(x)$ with period length $L$ can be expanded as (Fang & Thews
1998)

$$\rho(x) = \bar\rho + \bar\rho \sum_{j=0}^{\infty}\sum_{l=-\infty}^{\infty} \tilde\epsilon_{j,l}\psi_{j,l}(x), \qquad (A35)$$

or the density contrast $\delta(x) = (\rho(x) - \bar\rho)/\bar\rho$ is

$$\delta(x) = \sum_{j=0}^{\infty}\sum_{l=-\infty}^{\infty} \tilde\epsilon_{j,l}\psi_{j,l}(x), \qquad (A36)$$

where

$$\bar\rho = L^{-1}\int_0^L \rho(x)\, dx \qquad (A37)$$

and

$$\tilde\epsilon_{j,l} = \int_{-\infty}^{\infty} \delta(x)\psi_{j,l}(x)\, dx. \qquad (A38)$$

More generally, we have

$$\rho(x) = \rho^J(x) + \bar\rho \sum_{j=J}^{\infty}\sum_{l=-\infty}^{+\infty} \tilde\epsilon_{j,l}\psi_{j,l}(x), \qquad (A39)$$

where $\rho^J(x)$ is the density field smoothed on scale $J$,

$$\rho^J(x) = \sum_{l=-\infty}^{+\infty} \epsilon_{J,l}\phi_{J,l}(x), \qquad (A40)$$

and the scaling function coefficient (SFC) $\epsilon_{J,l}$ is given by

$$\epsilon_{J,l} = \int_{-\infty}^{+\infty} \rho(x)\phi_{J,l}(x)\, dx. \qquad (A41)$$